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LATENT FACTOR MODELS AND ANALYSES 
FOR OPERATOR RESPONSE TIMES 



D. P. Caver 

Department of Operations Research 
Naval Postgraduate School 
Monterey, CA 93943 

L G. O’Muircheartaigh 
University College 
Galway, Ireland 



0. INTRODUCTION, BACKGROUND, AND SUMMARY 

There arc many situations in which an operator (single individual, or group 
or crew) is confronted with a somewhat complex task that must be accomplished 
within prescribed time limits. The task actually often initially requires diagnos- 
tic steps followed by action. In some cases the diagnostic steps are stimulated 
by a cue event, leading to probing actions intended to reveal the correctness 
of a tentative diagnosis, followed by observation and interpretation of system 
response, in turn followed by viewpoint revision and further action. While it 
is intriguing to attempt to model response in such detailed terms, this paper 
does not embark on that enterprise. Rather, we provide and analyze models for 
the resulting overall response time of different operators to different tasks where 
response is initiated by one or more cues provided by the system. Two factor- 
analytic models are presented along with likelihood estimation procedures. The 
latter are then employed to analyze data sets from typical exercises conducted 
at simulators used for training nuclear power plant operators; their identities 
are kept anonymous. (The findings of the model are critiqued, and applications 
to risk analysis are sketched.) 
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\l is boliovod tliat similar rnodds will be useful for summarizing the behavior 
of oj)erators or cr^‘ws in other situations, both military and otherwise. For 
example, aj)j)liration to military tank driver performance is envisioned. 
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1. A MIXED OR LATENT FACTOR MODEL WITH THE 
UNEQUAL CASE FIXED EFFECTS AND VARIANCES 
(LOG N MODEL). 

Consider this linear model of mixed type: 

= /^ + i = .../ . . 

k = 1,2,. ../C ^ ^ 

where = luTiit with being the time for crew i to respond to situation k\ 
f^i and Uk are fixed constants (effects), and u;, ~ IlDN(0,aJ), eik ~ IIDN(0, a^.), 
are, respectively, the latent random component that “individualizes” case (in- 
dividual, crew, etc.) i, and the random variation displayed by any individual 
on situation (task, problem, etc.) k\ It is assumed that each case occurs in 
conjunction with each situation (e.g, a person confronts a particular problem) 
just once in the data set to be modelled. In practical circumstances, some such 
individual interactions may be missing for reasons unrelated to individuals and 
situations, a problem that is deferred for the present; see Appendix A and B. 

As implied, the model described may well be of interest when data pertaining 
to human performance are to be analyzed, but should also be of use elsewhere. 
The K tasks or items are allowed to have their own fixed response properties, 
described by this pair will be referred to as a task signature. The usual 

mixed ANOVA model formulation assumes aj. = , constant for all k (see 

Scheffe (1959)), as is reasonable when measurement error is represented. 

Note that because of the assumption of possibly unequal cr^’s a fixed-u;, model 
cannot be usefully estimated by likelihood. Consequently the above random- 
effect model has been introduced, and fitted to data. As will be apparent, it 
is possible to estimate the posterior density for cj, using Bayes’ formula in the 
style of empirical Bayes; the mean of the resulting Normal/Gaussian density 
I data, log N model], is available as an estimate of the ith crew effect if so 
desired. 

That the above setup is a latent factor model has been remarked to us 
by Professor T. W. Anderson; see Anderson (1988) Chapter 11 for relevant 
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rovorago in tlio NormaJ/Caiissian case, [irilling^'r and Preisler (1983) survey 
various oth(*r latent factor clata-analytical studies in nori-Gaussian settings; this 
includes a detailed discussion of a latent factor Poisson model for counting data. 
Brillinger’s paper is interesting in that it suggests examining goodness of model 
fit by “uniform residuals,” a procedure considered in our study as well. 

Fitting the LOG N model by likelihood requires iterative calculations; the 
setup is described in the next section. In case one wishes to ‘‘robustify” the 
formulation, perhaps by introducing more outlier-prone specifications such as the 
Student t or Tukey density for then more numerical effort, or approximation 
is required. Use of the Laplace approximation together with Gauss-Hermite 
integration may well turn out to be useful; see Gaver and O’Muircheartaigh 
(1987), and Gaver, Jacobs and O’Muircheartaigh (1990). In a later section a 
totally non-Normal/Gauss model for operator response times is introduced and 
fitted. 
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2. FITTING THE LOG-NORMAL MODEL BY MAXIMUM 
LIKELIHOOD 

In the model (1) the individualizing case effect, cj, for case i is viewed as 
a latent or unobserved rv whose effect on the Yn^ observable is indirect. What 
is the probability distribution of = 1,2,... A' in terms of the unknown 

parameters? Clearly it is multivariate normal since cj* and occur as a sum; 
the density for case (crew) i is, by conditional independence, given lJi , 



^=1 









K 

e Jt=i 

K 

^=1 



To obtain the unconditional density of remove the condition on cj,: 

ly(y.U‘.a) = /“ ^1. 



(2) 



( 3 ) 



The calculation needed (“completion of the square” in the exponent) can be 
expeditiously performed as follows. Recognize that the exponent is quadratic in 
u; put 



1 fiD,-OjY 1 . 1 ^ , , 2,2 

~2 ( — —) ~2 Y {y>k - - I'k - U)) jol, ( 4 ) 

A', being independent of u>. To find cJ*, r^, and A', differentiate (2.3) re u and 
equate coefficients of u; and 1: 

= )K. (5) 

^ k=\ 



SO, from the cj-term, 



Jt = l 



( 6 ) 
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wliilo from tlu* 1 tf'rrn 



K 

^xlr^ - (!/.fc - /i - I'k) l^l, (7) 

A:=l 

giving 

K 

u, ^ (y,fc - /i - t'fc) = 3/i- - (/^ + k>.) (8) 

fc=l 

where Wk = (l/<^) = (l/<^*) /E/=i ^.re thus tVjt-weighted 

averages. Substitution into (4) gives 
K 

= Y - /i - !ol 

kzz\ 



K r 



= E 



^=1 L 



Vxk - (i- ^'k ~Y 



fc=i 



U4 



(9) 



K 

= Y “ y') ~ 



A:=l 



Now from (2) 







duj^ 



a familiar convolution, from which 



(10) 



fY,{y,uixL,4,4) = 



g- 5 A'.g-i(u;,)^/(T^+< 7 j) 



Thus the likelihood of ft,E,n?,4 is proportional to 



v/t -2 + (t 2 

2 ^2 



r. 



(11) 



i- {^^,E,4,4xy) = n 



— e“2^*e“ 



( 12 ) 



1=1 



n v’’^+^u 

^fc=i 
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and hence the log-likelihood, /, may be expressed as 



2 j/) = / Inr^- 



ln(rHa2)-^ A'. + 

k-\ 1=1 t = l 

Differentiation gives these estimates: 



di 



dii'k - i'-) 



= ^ = tl “ y^-) - - 1^.)] (i/<^fc) > 



t=l 



so 



Next, 



so 






= j E ^'•*-7 E E y^kWk 

' 1=1 ;i=i 



1=1 

= {y.k-y-) 



01 ^ 1 . 

= 0 = ^ E - (/' + '"•)) 



d(/< + 1^.) 



+ (t2 



i=l 



(i+u. = y.. 



(13) 



(14) 



(15) 



(16) 

(17) 



These of course closely resemble conventional ANOVA estimates. 

When estimating variances it is convenient to reparameterize in terms of 
precision: pk = l/<^^ P = 1/r^ = ELi = ELi Pi- 

Then 



dl 

Opk 






p Pk l/p + <^3 






(18) 



where 



and 



Next 



^lik) = j2[iy^k-y..)-{y.k-y-)? 

1 = 1 

= E(^*)^ • 



i=l 



=0 = -/-^-.+ 






(r2 + a2)2’ 



(19) 

(20) 
( 21 ) 
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yields 



~^ol = Uuf = \ Y. ( 22 ) 

inlrodiicUon of (21) into (18) simplifies the latter to 

-i = lAn^-) + -, fc= (23) 

Pk t P 

The system (15), (16), (22) and (23) must be solved iteratively. Begin by simply 
fitting as if cr^ = to estimate /i(l), //fc(l),w(l), and obtain 

, / 

(2'.*-/‘(l)-'>*(l)-‘^.(l))' (24) 

^ ^ 1 = 1 

from which compute 1)) / Next calculate — 

i/.)t(2) — y.,{2) using Wk{2) in (15), and /i+i/(2) = y..(2) from (16). It is now 
possible to evaluate A](k\2) from (19), and (o;)^(2) from (20), and hence Pk{2) 
and ;5(2) from (23), after which cr^(2) from (22). Now recompute Wk{3) = 
{\/al{2)) / = PA:(2)/p( 2), and so repeat iteratively until conver- 

gence is achieved. A solution procedure based on Newton-Raphson iteration 
has also been obtained; agreement of the two procedures is generally good. 
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3. LOG-EXTREME- VALUE MODEL (THE LOG EV MODEL) 

An alternative to the previous model that may be attractive is the following 
setup: 



(a) Tik is distributed according to a two-parameter VVeibull; then it follows 
mathematically that 



(b) Yifc = In Tik has the extreme-value distribution: 

1 — exp [—0; exp {(yik — with probability density function 






exp[-0, exp((j/a- 



r]k)/^k)]0^(^xp{{ya - Vk)/^k)-^ 

a 



(25) 



Note the occurrence of parameter 0,, which is intended to represent crew 
effect, i.e. is a way of individualizing crews comparable to the action of lJi in 
the previous model. Values of 0^ are viewed as randomly selected latent factors 
as were the a;, values. The nature of the contribution differs from a;, in this 
model: whereas in the LOG N model u;, acted purely additively (on the log scale) 
to affect the center (mean of logged response times) in a manner common to all 
tasks, in the LOG EV model it can be seen that logged times are represented as 



= m + ^k[{-\ne,) + €,,i (26) 

€tk having standardized extreme value df. For the present model, (25) or (26), 

E [Y^k\0^] = ru - 0.5772^^ - In 0, (27) 

2 

= 1.6449^1 (28) 

which permit initial parameter estimation by moments and facilitates compari- 
son with the results of alternative models. Expression (26) implies that responses 
to tasks are affected differentially: the greater the natural variation in perform- 
ing a task by crews (measured by for task A:) the greater the ‘"average” effect 
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on task duration (hi(! to crow olfoct. This is a sjjocific form of interaction be- 
tween crew and task effects that may (or may not) be reasonable in particular 
ri rriiiustaiir(‘s. 

OjnditionaJ on 6^, tin* crew Ts response, on K difrerent tasks has joint 
density function 

K 

(k,; <9,), (29) 

k=l 

where conditional independence is assumed. In order to obtain the unconditional 
joint density of response remove the condition on 0, by integrating out; this 
step corresponds to (10). ddius 






(30) 



where 



and 



K 

c, = ^ cxp{{y,k - Vk)/^k) 
k=\ 



(I, = exp ( ^ {y,k - r]k)/^k 



k=\ 



n (1/^^) 



k=l 



( 31 ) 



(32) 



The above model closely resembles one introduced by Crowder (1985) and 
Crowder and Kimber (1989). However, ours deals with the log time, and hence 
is a location scale model that more closely compares to the additive log-normal 
model, although the r/jt is not generally a mean, nor is ^ standard deviation. 



4. GAMMA VARIATION FOR 6 

A search for mathematical tractability suggests that variation in 6 be de- 
scribed by a gamma density: 



6 ^ e 



r(i//?) 




(33) 
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so E[0] = 1 and V ar[0] = /?, and from which the joint density of observations 
by crew i is 






r(K + i/p)f 1 



f— V 

V 1 + /3ci ) 



r(i//J) 

The likelihood associated with / independent crews is 



rd,. 



. ..I - (iin±mi3K\ ' A '' + ' 



= ( 



)n(nW) 



or 



r(i//?) 

/ = InL = //v'ln/? + 71n(r(A' + l//?)/r(l//?)) 

/ / 

— ( A + 1//?) ^ ln( 1 + /ic, ) + ^ In 



t=i 



1=1 



(34) 



d, (35) 



(36) 



After arrangement and re-parameterization so that 4>^ = \n^k the log-likelihood 
becomes 

A'-l / / A' A' 

k=0 t = l t = J fc = J Jl- = 1 

(37) 
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5. FITTING THE LOG-EXTREME VALUE (LOG-EV) MODEL 



BY MAXIMUM LIKELIHOOD 



'I’o ol)lain tlio maximum likelihood estimates of the parameters we iteratively 
solve the following equations, for which k= 1,2, throughout; 

dl/OP = 0, 

dl/drik = 0, (38) 

dl/d<l>k = 0. 



One Newton-Haphson iteration only is applied to each equation, after which 
the entire process is repeated until convergence. Typically, only two or three 
repetitions are required. 

We record the derivatives needed for the above process. 



E. 

op 



I I 

-(A' + l/P) x; (c,/(l + Pc,)) + (\/P^) ^ ln(l + Pc,) 



1=1 



1=1 



A'-l 

-I UPii + f^P) 



k=Q 



dl 

dt]k 



/ 

{Kp + l)exp(-</>0 ^ +Pc,)~ Ie~^k 

1 = 1 



(39) 

(40) 



where r,k = {y,k - %)/6> a residual; finally 



dl 

04>k 



(A'/3 + l)E 



1=1 




(41) 
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The second derivatives are 



q2i / / 

^ = /A7/?2 + (A- + 1//J) x: + /?c.)2 + (2//J2) ^ c./(l + ^c.) 

i=i 1=1 



I K-\ 

-(2//?")X: ln(l+/?c.)-(///?^) x: (A:+l//3) 

t = l it=0 



-2 



A-1 

k-Q 



(42) 



^ = (1 + A'/3)e-2<^'^ ^ {(e^-V(l+/?A)) [(/3e^‘V(l+M))-l), (43) 

^ = (1 + A'/?) ^ |/?(r,fce^'*-7(l + ^c,))^ - e'''*’T,fc(l + r,fc)/(l + /?c,)| 

/ 

+ E (44) 

1 = 1 

In order to use the inverse of the Fisher information matrix to provide stan- 
dard errors of the parameter estimates all cross-partial derivatives are required; 
we omit recording these in the interest of brevity; the expressions may be ob- 
tained from the authors. Our numerical experience has been that standard 
errors obtained from Fisher information tend to be too small, as judged from 
bootstrapping approaches next to be described. 



6. BOOTSTRAPPING 



A modern alternative for obtaining standard errors and approximate confi- 
dence limits is the parametric bootstrap of Efron (1979; esp. Remark K, p.25). 
This procedure has recently been applied to failure data in the context of the 
Challenger disaster by Dalai, Fowlkes and Hoadley (1988) and goes as follows: 
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put 0 • • .^A ) ”' N,aii(l {P\ rii,...,TiK\ 4>i , ■ ■ ■ ,4 >k) 

in Mo(l<‘l L(K; lev. Nolo that 0 = (Oi,...,0p) here donotes a generic parameter; 
it Ix'ars no direct relation to the crew effect in our LOG EV model, (25). 

1'lion our procodure is this: 

(a) Kstiinato 0 from data; the result is 0(0), the point estimate of the param- 
('ters. 

(b) Provisionally adopt 0(0) as the true value in the parametric model, in the 
|)resent case (I) or (25). 

(c) Simulate B independent data sets (bootstrap samples) from the model 

evaluated at 0(0): {^^A :(^)7 ^ = 1> • • • ^ ^ 2, . . . , A'; b = 1 , 2, . . . , /i}. 

(d) Compute estimates of 0 for each sample, obtaining the bootstrap estimates 
{d{h)J) = 1,2, . . . , /i) = {0(/i)), the bootstrap distribution of 0. 

(e) Present relevant statistical summaries of marginal and joint distributions 
of (0(i?)}: e.g. use as standard error of 0(0) components the corresponding 
standard deviations of the bootstrap estimate; use as confidence limits 
upper and lower jiercent points of the bootstrap sampling distributions, 
suitably adjusted. We present numerical illustrations in the next section. 

(f) The same procedure can evaluate standard errors of, and confidence limits 
for predictions from data: in the present case prediction of the probability 
that a response time exceeds any given value is evaluated in terms of the 
model evaluated repeatedly at bootstrap parameter estimate values; see 
Dalai (t (il (1988 ) for an example. See Section 9 for an example in the 
present context. 

Bootstrapping methods suggest themselves for comparing the adequacies of 
different models for fitting and predicting from specific data sets. Specifically, 
bootstraj)ping may assist in choosing between two, or more, candidate models. 
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In the present setting one may wish to predict the probability of non-success, 
i.e., of response time exceeding some time window of duration P(i\ 0). Models 
A and B (e.g. our LOG N and LOG EV options) are estimated obtaining 0^(0) 
and ^^(0). Then generate bootstrap samples for A and B using ^^(0) and 
0^(0) respectively, resampling to estimate the mean-squared error of prediction 
when Model i is used to predict, given that the data comes from Model j; here 
= {A, A; A,Z?; B,D; B,A}. Prefer the model whose use minimizes the 
maximum estimated mean-squared error of prediction. An alternative strategy 
is to prefer prediction from the most conservative model: the one predicting the 
greatest risk; see Section 9 and Draper, Hodges, et al (1987). 

Another option for residual examination is to compute estimates of the ex- 
pected log response times associated with each Task/Crew combination. Since 
crew effects are random, we estimate them in specific cases by their posterior 
means. 

For the LOG N model, examination of (10) reveals that the posterior density 
of is N [(u?,/r^) / (1/r^ + l/a^) , 1 / (1/^"^ + 1/^^)] • We substitute in the 
mle’s for the various parameters to estimate in a particular case: 

w. = (w,7(r)^) / (l/r^ + l/cr^) 
and then from (1), (8), (15), (16) 

Vik = ifi+K) + = y.. + (y.k - y..) + [(?/.. - |/..)/(rf] / (l/(f2) + l/^^] 

(45) 

where all averages are suitably weighted. Note that the above formula for the 
mean acts, in effect as if a preliminary hypothesis test for homogeneity of crews 
is being applied: if is very small, giving evidence that all crews are the same, 
then the estimate ytk — y.k^ (weighted) task mean for each crew. On the 
other hand if is very large then y^^ ~ yj^ + (t/t. - y..), the task mean modified 
by the estimated effect for crew i. The effectiveness of such a smooth transition 
when pooling data was noted by Mosteller (1947). 
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I'or tlio L(KJ KV rriodol tako the expectation of (27) with respect to the 
(estimated) posterior density of 0,, which is Garnma(c, + \/(3^K + 1//3) from 
(33) and (3d). Using the first two terms of the asymptotic expansion we find 



IJik 



= /■^[y,k] 

= 7% - 0.5772^\ - ik [in (a' + l/p) - In (c. + 



\/p) -0.!y/{K + 1/0) 



6 ) 
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7. EXAMPLE DATA ANALYSES 



The previous models have been used to analyze response time data from sim- 
ulator experiments involving operators performing certain safety -related tasks. 
In Tables 1 and 2 appear actual data from two such: System L and System D. 
It is noted that certain task-crew combinations are missing, some because of 
simulator failure. Two procedures were adopted for dealing with these cases: 
(1) values were imputed by an EM-like process; see Little and Rubin (1987); 
alternatively, (2) a likelihood approach was taken that simply omits such values 
from the analysis by setting to unity the likelihood contribution associated with 
a cell having a missing response. Both approaches can be useful; the former 
leans more heavily on model correctness. The analyses reported here emphasize 
the use of a simple imputation procedure; the incomplete data results are also 
reported in the summary tables. 

In addition to “true” missing values there are observations, here marked 
with asterisks, that were judged to result from operators following non-standard 
response strategies. It was judged to be useful to analyze the data both with, 
and without, including such values; when omitted, those nonentries in the data 
table were treated as missing values, entries imputed as above or treated as 
missing values, and analyses made using LOG N and LOG EV models. 

Results of fitting, along with bootstrapped standard errors, appear in Tables 
3 and 4 for System L, and Tables 5 and 6 for System D. Both tables exhibit 
main effects (/^ + ^ii^d scale parameters (ak^nd^k) for LOG N/EV mod- 

els computed under A: missing values, and non-standard strategy values, both 
treated as literally missing, using methods of the Appendix, and alternatively 
with entries imputed, and B: only the “true” missing values treated as missing. 
The imputation process used was iteration based on standard two-way ANOVA 
with fixed task and crew effects. This is a convenient crude approximation to a 
proper EM algorithmic approach. Little and Rubin (1987). In addition, random 
crew effect variance parameters, aj and j3 respectively for the two models, were 
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(*sti mated, d'lie resampling- refitting parametric bootstrap supplied the standard 
errors; see Efron (1979) and D^Jal, Fowlkes, and Iloadley (1988). 

It is noted that for System L the fixed effects (fi+t^k,Vk} under A and B agree 
riosc'ly, with the exce[)tion of Tasks 2 and 9. Data for Task 9, in Table 1, exhibits 
8 out of 18 missing values, and a further 4 non-standard strategy values, all of 
which are far in excess of other times for that task. Data for Task 2 also exhibit 
substantially many missing values and non-standard times, the latter having 
resulted from operators following non-standard procedures and hence yielding 
times more lengthy than the other, acceptable, values. The noticeable differences 
are, however, still within 2 bootstrap standard errors. The corresponding scale 
effects (e.g., log task time standard deviations) for Tasks 2 and 9 behave in 
corresponding fashion, increasing by factors of 2 to 3 if the non-standard times 
are included. 

Similar behavior occurs for System D, although here the exceptions occur 
for Tasks 4 and 8. There are fewer missing and non-standard times reported 
for System D than for System L. Relatively large changes occur in the standard 
errors, as well as in main effect and scale parameter estimates, when several 
missing or non-standard times are encountered, and these are treated differently 
in the an;ilysis. 

In order to check for the effect of imputation, and also for that of apparent 
correlations between certain task times the analyses were re-run for System L 
omitting Tasks 2, 4, and 9. The results appear in Tables 7 and 8. Although 
specific numerical values are changed, the general pattern remains quite similar: 
the new numbers are quite often well within a standard error of the estimates 
that utilize data from all tasks. 
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8. MODEL CRITICISM VIA RESIDUAL EXAMINATION 



In order to examine the overall fit of the models to data it is useful to con- 
duct some form of residual analysis. We have chosen first to judge the overall 
degree of fit by computing and summarizing uniform residuals as described by 
Brillinger and Preisler (1983). In general for this procedure one estimates model 
parameters 0, from data and then examines the estimated probabibty integral 
transformation of the data, utilizing the fitted model: if the model is correct 
then the latter should closely resemble the uniform distribution. In Figures 1 
through 8 we display plots and summary statistics for such estimated proba- 
bility integral transforms of the present data set. We also exhibit the result 
of bootstrapping once: each model was allowed to create one set of bootstrap 
sample data utilizing the fitted parameter values; these values were then treated 
like raw data, and were then probability integral transformed and the results 
plotted and summarized. 

The results have different implications for the appropriateness of the models 
for the two data sets. The left uniform plot of Figure 1 shows decided non- 
uniformity of residuals when the raw data is fitted by LOG N using the methods 
of the Appendix; however, if a bootstrap sample is generated using the fitted 
model parameters the results are far more uniform. This strongly suggests that 
the basic model is inappropriate. A similar implication is obtained by examining 
Figure 3, the residuals of which are associated with LOG EV. Figures 2 and 4 
describe the residuals when imputation of missing and/or non-standard values 
is conducted. Notice that uniformity of residuals of the fit of the raw (plus 
imputed) data is greatly enhanced. This is not surprising since imputation is 
based on presumption of model correctness, and the missing and non-standard 
values are imputed using the presumed model. The same general behavior is 
observed when data from System D are fitted by the two models in various ways. 
Here, howev^er, the departure of the residuals from uniformity, as shown on the 
left-most residual plots of Figures 5 and 7, seems less pronounced than is the 
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fjuso for Systom L. Again, imputation improves the uniformity of the residuals 
and the aj)par<‘nt fit. We conclude that the log-additive models are more likely 
to be trustworthy for system 1) than for System L. 
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9. MODIFICATION OF THE MODELS FOR INITIAL DELAY 



In the nuclear plant simulator exercises, and doubtless for other applications 
as well, certain task response times may begin after a cue different from, and 
J later than, the actual initiating event. That is, the operational cue that triggers 
response may occur some time after a possible original cueing event (e.g., the 
first evidence of nuclear plant abnormality). Unfortunately, the time of the op- 
erational cue is not usually recorded in simulator practice, and so response time 
data, which may use initiating event time or some other well-specified event 
time for reference, tends to exhibit an initial delay. Such delays may be inferred 
from plant and initiating event information, or by examining the actual response 
time data. Note that ignoring such delays if they are appreciable, i.e., fitting 
a 2-parameter LOG N or LOG-EV model when a 3-parameter specification is 
more appropriate can importantly change the estimated parameter values, par- 
ticularly the LOG N ajt or LOG EV the measures of within-task variability. 
Specifically, if the delay is ignored and our current models employed uncritically 
when a delay > 0 is required then the estimated values, and will be 
biased downwards (under-estimated), sometimes quite significantly. 

To illustrate the effect of a rough accounting for delay examine the Task 10 
data for system L in Table 1. The minimum value is 1402 and the maximum 
is 3450, so there is an appreciable delay associated with beginning the task 
as compared to the variability of the response times. If the data is taken at 
face value and our LOG N and LOG EV models fitted, then Table 3 exhibits 
^10 - 0.28 considerably smaller than that for other tasks. If, however, a rough 
adjustment is made for delay by subtracting 1000 from each Task 10 response 
time data value then the simple standard deviation estimate for logged (times- 
1000) for Task 10 becomes 0.59, far more similar to other task response time 
standard deviations. 

At present the LOG N and LOG EV programs fit 2-paraineter models, so 
accounting for delay, 7^., must be done off-line. A formal approach to the es- 
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tirnatioii prohlcm for Uio VVoibulI inodol is given by Smith and Naylor (1987); 
that paper also references other relevant articles. 
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10. RISK CALCULATIONS 

An innportant application of the LOG N and LOG EV models is to risk 
analysis: it is desired to estimate the probability that a task’s response time 
occurs within a particular time window, for in that case (some aspect of) the 
threat has been averted. We will refer to the probability that response time 
exceeds the time window as the human interaction risk associated with the task. 

It is easy to make point estimates of the required probabilities using both 
♦ existing models: one simply replaces the model parameters associated with the 
task of interest by their maximum bkelihood estimates. A natural way in which 
to handle crew effect is simply to remove the crew condition lj ox 6 by integrating 
it out with respect to the appropriate Normal/Gauss or Gamma estimated prior. 
In order to assess the effect of a particular crew on the risk it is necessary to 
calculate the posterior density for that crew and then integrate out on cj or 0 with 
respect to that posterior. It may well be of interest to compare the estimated 
risk as it depends upon which crew is in place when an initiating event occurs 
in order to assess the effect of individual crews directly on risk. This is not done 
here. 
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11. RISK UNCERTAINTY 

In order to assess the the uncertainty inherent in the risk estimation one may 
once again bootstrap. The procedure is this: 

(a) Estimate d from data to obtain 0Q- 

LOG N : 00 = 

LOGEV: eo = iP,v,i^) 

(b) Provisionally adopt 6q as the true value in the parametric model. 

(c) Simulate B independent data sets from the model with parameter Oq : 

with/>= 

(d) Estimate 6 from each sample of (c). Obtain {0{b)^ 6= 1,2,...,/^}; 

(e) Compute P{T.A: > u;jt|^(6)} = P{ln Tk > In yJk\^{b))\ 

= where 






In 



The risk associated with Task k estimated from bootstrap sample 6(6 = 
1, 2 , . . . , /y) for each model, that is, calculate the probability that the time 
window is exceeded using the b^^ set of parameters estimated. For our 
two models and b = 1 , 2 ,..., i?), and also 6 = 0, the original estimate, thus 
becomes, respectively, for the two models. 



fl^-(6;LOG N)= 1 - ^ 



/ w[ - ifi{b) + 

\ + dl(b) j 



(47) 



/?fc(/»;LOG EV)= (n-/3(6)oxp(«4- 7)^(6)) /a6)) 



(48) 
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The following table exhibits risk calculation for prescribed windows; the lengths 
chosen are for illustration only. Calculations have been made both without non- 
standard strategy values (A) and including non-standard strategy values (B); 
any missing or left-out values were imputed as before. 

Examination of the point estimates in Table shows that there is considerable 
similarity in the orders of magnitude of the risks given by LOG N and LOG EV. 
However, LOG EV values are consistently below those for LOG N, as are the 
corresponding confidence limits. It is, thus, more conservative to adopt LOG N 
model-based risk estimates than to use those based on LOG EV, or equivalently 
the Weibull model. The exception. Task 9, is probably traceable to the many 
missing values exhibited, 

(f) Present statistical summaries of marginal and joint distribution of Rk{b)- 
e.g., use as standard error for the original risk estimate, R^{0)^ the stan- 
dard deviation 



Note that the overall risk associated with a particular initiating event de- 
pends upon the risks associated with all tasks (human interactions) associated 
with response to the event. These risks may well be dependent probabilities, 
at minimum because all tasks presumably confront the same crew. To handle 
the dependency induced by a common crew one can assume conditional inde- 
pendence, multiply risks conditional on crew to obtain the risks associated with 
joint events, and then integrate out with respect to the crew’s posterior density. 
The calculation may be performed explicitly for the LOG EV model since a 
closed-form elementary function expression for the extreme- value survivor func- 
tion exists; no such simple calculation can be carried out for the LOG N, but 
numerical procedures are always available. 




(49) 
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Hxarnination of the results suggests considerable similarity between the risks 
calculated using the LOG N and LOG EV models. It is, however, noticeable 
that for all tasks except Tasks 3 and 9, LOG N predicts a sliglitly greater risk 
than does LOG EV, and generally with slightly larger standard error. Of course 
in most cases shown the risks associated with the window values in our example 
are too high to be realistic; the windows were chosen for illustration only. 

3'he above calculations have been carried out using parameter values ob- 
tained under imputation. Thus the apparent similarity of risks across models is 
probably overstated, and, in view of the suspicion cast on the fits to system L 
data (see Section 8) we should treat the System L risks with caution. 
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12. SUMMARY AND CONCLUSIONS 



In this article we have suggested and shown how to fit, by maximum bkeli- 
hood, two models for operator response times. The fits of the models to two sets 
of (actual) data are displayed and compared. Uncertainties are assessed by para- 
metric bootstrapping. Complications involving missing values and consequent 
lack of balance are dealt with by direct likelihood computation as well as by a 
simple form of imputation. Finally, the fitted models are applied to estimate 
the risk of exceeding (hypothetical) time windows; associated uncertainties, i.e., 
standard errors and confidence limits, are obtained by bootstrapping. 

We view this work as a pilot or feasibiUty study intended to illustrate and 
explore possibly useful approaches and methodology to an important area. Out- 
standing problems remain: the models put forward were chosen for their abilities 
to account for some aspects of the real situation and for relative tractability, but 
many other forms could be conjured up, fitted, and applied to infer risk, as de- 
fined here. It is somewhat interesting to find that risks estimated from the same 
data using the different models agree rather closely; it is not unlikely that the 
agreement will suffer if the windows are increased so as to achieve much smaller, 
and presumably more realistic risk values. 

The bootstrap standard errors and confidence limits warn that although 
the present models tend to agree in their risk assessments the uncertainty is 
still rather large, even if wrong-model or structural uncertainty is ignored; see 
Draper, Hodges et al (1987) for discussion. In order to reduce the uncertainty of 
estimation, e.g. to reduce standard error size, it is often proposed to aggregate or 
pool data, either from similar tasks in the same environment (plant), or for the 
same task across ‘‘similar” environments. Both procedures are worthy of investi- 
gation, but will be credible only if suitable adjustments are made to reduce bias. 
Adjustments can be carried out by using models resembling the types suggested 
here, possibly enhanced to include regression terms “explaining” responses in 
terms of measured and qualitative crew and plant characteristics (“performance 
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shaping factors’’ is the jargon in certain risk assessment circles). To date, for- 
mal adjustment attempts by rc'gression have been inconclusive but are a form of 
insurance that should be included, and validated to the greatest extent possible, 
if aggregation or pooling is contemplated, particularly across plants. Inter- plant 
variability may be appreciable because of variations in management philosophy 
and style. 

In general, it seems advisable to utilize bootstrapping as extensively as possi- 
ble to build an appreciation for the variabilities and uncertainties involved when 
using models. Bootstrapping that uses direct re-sampling as originally discussed, 
Kfron (1979), seems difficult or impossible for situations such as are described 
here unless vastly more data becomes available and better, more scientifically 
based, models and true replications can be employed. Consequently, use of 
parametric bootstrapping becomes necessary. However, parametric bootstrap- 
ping that consumes data from more realistic, and elaborate, models and their 
fits to simpler structures can be useful and informative. But such exercises can- 
not directly substitute for data obtained under truly operational circumstances, 
which even the best simu/rHor data can not aspire to be. 
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APPENDIX A, MAXIMUM LIKELIHOOD ESTIMATION FOR LOG 
N MODEL, WITH MISSING VALUES 

Lot = 1 if tho ohsorvatiori at (i,k) [i.e., for crew i, task k] is available; 
otherwise = 0. Then the component of likelihood associated with (Crew) i 
is 



A 






dik 



Jt = l 






7T(7k 



Y2 ^tkivtk-^^-i'k-^^tf /^i 



f /czrl 



(vS) 






Now write 



^ fitk (y,k - fi - I'k - wf • — = 



1 _ (u.. - u;)^ , ,, 

2 -t- A,. 



h-\ 



After differentiation it u, 



dxk 



(Vik - n-Vk-^) = 



(u, - u>) 



k=i 



which implies, identifying terms of order 1 and cj, and writing 



A' 

E (^tk (Vtk - /i- n jPk = 
k=l 

E (^,kPk = 1/t;^ 

k=l 



Hence = 1/ E ^,kPk 



k=\ 



A 



E (Ixk (y,k - P - k>k)Pk = yt. - p-rf £ where y,. = rf ^ d,ky,kPk 

^=> A:=l *:=:i 

and 

A 



= XI ( 2 /a- - p - n- ^xf 



Pk- 



k = \ 
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Thus re-write the likelihood as 









a 



(v/^) 

^-iK.g-iu-?/(T2 + a3)y^ 






(Tl'‘ 






Hence the likelihood assumes the form 

" ,y Ofc^=. 

Taking logarithms, we obtain 






J -2 



l = 2\nL = - Y^K, ~Yl -TT-T + ~ XI In ^ In (t;^ + , 

t = l t = l ' ^ 



t = l 



K 

c 

it=l 



I 

z 

1=1 



= -^A- 



/ 



U: 



^ + < T ,2 



O ' 



+ XI + XI 111 - XI *" 



where 






Differentiating first re /i, we obtain 




2u;, dujt \ 

?T^^j 



Hence 



^ ( ~ ^ ) _ Q 

9/. ' ^ ■ 



-E 



u;,- 






+ ^i) 



= 0 
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?A. - II fhkl^kPk 



k=\ 



{rf + ol) 



= 0 






I , I [yx. - rfIZd,kPki^k] 

t = l 1 = 1 

( Vx. -TfH dxkt^kPk ) 



Next, (lifrerentiate re i^i: 



* I 1 _ Q 

5/^/ di^i , 7-^ + J 



( QCj \ 

-<5/*: - j 



i>W, 2 j 

^ = ->■. 



Hence 



E. 

di'i 



— ^ ^ \ ^ ^ ^ (2/*/: ^x^Pk ^ ^Ik "f" ^tlPl^ ^ 2 I 2 ^tlPl 

1 = 1 I /: 



= 0 



Thus, 



- XI 1 '^fd.ipi XI rfa- (y.t - /^ - k>k - ‘^x)Pk - d,k (y,i - p - ui- u,)pi - 
1=1 I k 

Further siniplification results in 



^xTj 

rf+(Tl 



dxipi > = 



- XI I XI -dJx)Pk- (y,/ - /i - ^'/ - w.) - ^ 



t ut 
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Hence 



i// ^ ^ dll — ^ ^ dii 
I 1=1 



-'^i (y>k - ft - i^k - (^i)pk + {y,i - p-u^i) + 



t}uj. 



{r^ + aj ) 



^ -/' - n -w,)Pit + {y,i -p-u,)+ ■ 

.=1 Ik w i + 

We now difTerentiate re pi^ noting 



dpi 



E (hkPk'^ 



;d,i = -T^d,t 



^ (y>i - /i - 1'/) + jjZ {yik - p- i'k)pk 



= T^d,i |y,7 - p - t'l- T?'^ d,k {y,k - P-i^k)Pk^ 



= Tfd,i(y,i - n-ui-u,) 



OK, 

dpi 



= da {ya - p- //, - +2^ d,;t (y.it - P- I'k - <^i)Pk {-T?d,i (y,, - p - i/i-i 



= dii {y^k - p - I'k |(j/w - p - I'l -u>,)~ 2r/“ {y,k - p - I'k ~ 



Noting also that 



Ti ^ d,k [y,k - p - I'k -d;,) Pk = t?Y 1 ^y'>= -ft-t'k) Pk -w,r^ Yl =U, -U, 

k k k 



it follows that 



dht , . _ .2 

^T“ = dll (yii - p - i^i -i^t) . 

dpi 
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m_ 

dpi 



Hence, 



= ^ fl-ii iVii - p - I^i - 



2w,r? 



iVxl - p - i^i- U>x) 



(I - T^du) , 1 






{rf + ol) 

T-, ^ ' Pt {rf + al) 



dl . 

7;— = 0 => 

dpi 



Y1 {Vxi - P - - >^x) 



- , 2w,r2 

Vxl - p-Ul-U),-\- -T- 

— ^ _L rr^ 



rf + K 






irf-VolY 



-^ +- 



Pt 



Pt 






2uiTf 



(Vil - P - I'l -i^i) \ Vxl - p - ui - Ui + 



rt + < 



+ 



I I 



{rf + crlY 






rf ^(t\ 



Finally, difTcrentiating re we find 



di 



Sw + 



^5) 



= 0 



which implies 



= E 



O'? - r? 



:/E 



1 



The following iterative procedure was employed: 



(a) Use imputed values to obtain fi{0) ,crl{0) i^),Pk (0)[A: = 1,...,^^] 

(b) Solve for fi{l) using (A-1) Update u), 

(c) Solve for i>i (1) using (A-2) Update u), 

(d) Solve for pi using (A-3) Update 

(e) Solve for using (A-4) 



Repeat (b) through (e) until convergence. 



34 



REFERENCES 



Anderson, T. W. (1984) An Introduction to Multivariate Statistical Anal- 
ysis {Second Edition). New York: John Wiley. 

Brillinger, D. R., Preisler, H. K. (1983) ‘‘Maximum likelihood estimation 
in a latent variable problem.” Studies in Econometrics^ Time Series and 
Multivariate Statistics. New York: Academic Press. 

Crowder, M. (1985) “A distributional model for repeated failure time mea- 
surements.” Journal of the Royal Statistical Society, Ser. B, 47, pp. 447- 
452. 

Crowder, M. (1989). “A multivariate distribution with Weibull connec- 
tions.” Journal of the Royal Statistical Society, Ser, B, 51, No. 1, pp. 
93-107. 

Dalai, S. R., Fowlkes, E. B. and Hoadley, B. (1990) “Risk Analysis of 
the Space Shuttle: Pre-Challenger prediction of failure.” Journal of the 
American Statistical Association, 84, No. 408, pp. 945-967. 

Denby, L., and Mallows, C. (1988) “Robust analysis of mixed models.” 
AT&T Statistical Research Reports, No. 65. AT&T Bell Laboratories, 
Murray Hill, NJ. 

Draper, D., Hodges, J. S., Learner, E., Morris, C. N., Rubin, D. B. (1987). 
A Research Agenda for Assessment and Propagation of Model Uncertainty, 
Rand Note N-2683-RC. The Rand Corporation, Santa Monica, CA. 

Efron, B. (1979) “Bootstrap methods: another look at the jackknife.” 
Annals of Statistics, 7, pp. 1-26. 

Gaver, D. P., and O’Muircheartaigh, I. G. (1987). “Robust empirical Bayes 
analyses of event rates.” Technometrics, 29, No. 1, pp. 1-15. 



35 



(lavor, I). P., Jacobs, P. A., arul O’Muirclieartaigh, I. G. (1990) “Regres- 
sion analysis of hierarchical Poisson-like event rate data: superpopulation 
model effect on predictions.” (to appear in Communications on Statistics). 

Ilougaard, P. (1986) “Survival models for heterogeneous populations de- 
rived from stable distributions.” Biometrika, 73, pp. 387-396. 

Hougaard, P. (1986) “A class of multivariate failure time distributions.” 
Biornctrika., 73, pp. 671-678. 

Kimber, A. C., and Crowder, M. J. (1988) A Repeated Measurements Model 
with Applications in Psrychologrjy University of Surrey Technical Report on 
Statistics, No. 65. 

Little, R. J. A., and Rubin, D. B (1987) Statistical Analysis with Missing 
Data. New York: John Wiley and Sons. 

Mosteller, F. W. (1947) “On pooling data.” Journal of the American 
Statistical A.ssociatioriy 43, No. 242, pp. 231-242. 

Scheffe, II. (1959) The Analysis of Variance. New York: John Wiley and 
Sons. 

Smith, R. L., and Naylor, J. C. (1987) “Statistics of the three-parameter 
Weibull distributions.” Annals of Operations Research, 9, pp. 577-587. 

Watson, A. S. and Smith, R. L. (1985) “An examination of statistical 
theories for fibrous materials in the light of experimental data.” Journal 
of Materials Science, 20, pp. 3260-3270. 



36 



TABLE 1 
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TABLE 2. 

SYSTEM D TASK-CREW TIMES 
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Figures not in parentheses are estimates when no imputation is carried 
out for cases with missing values - methods described in the Appendice: 
are employed; figures in parentheses are the cor respondi ng estimates 
when values are imputed for missing cases. 
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STD. ER. 90y. CONFIDENCE INTERVAL 

UPPER LOWER 

.22(0.06) 0.A8C0.58) 0.02(0.18) 
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are enployed; figures in parentheses are the cor responding 
when values are imputed for missing cases. 
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STD. ER. 90^^ CONFIDENCE INTERVAL 

UPPER LOWER 

.03(0.06) 0.1SC0.2A) 0.01(0.05) 
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igure 1. Uniform Residuals 
System L, Model: Log N. no imputation 
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Figure 2. Uniform Residuals 
System L, Model: Log N, imputation 
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Figure 3. Uniform Residuals 

System L, Model: Log EV, no imputation 
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igure 4. Uniform Residuals 
System U Model: Log EV, imputation 
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Figure 5. Uniform Residuals 
System D, Model: Log N, no imputation 
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Figure 6. Uniform Residuals 
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Figure 7. Uniform Residuals 

System D, Model; Log EV, no imputation 
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Figure 8. Uniform Residuals 
System D, Model: Log EV, imputation 
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